2 CONFIDENCE FOR CHANGE-POINTS
a contribution to the methodological side but also presents real-data application
stories. Our methods aim at spotting change-points but, importantly, along with a
full assessment of uncertainty, in the form of confidence distributions.
A fruitful statistical framework is as follows. Suppose $y_1, \ldots, y_n$ are independent
from a model with density say $f(y, \theta)$, with $\theta$ of dimension say $p$. Our theme is that
of pinpointing and providing full inference for the break point $\tau$, assumed to exist,
where the $\theta$ associated with $y_1, \ldots, y_\tau$ is equal to one value, say $\theta_L$, whereas the
parameter vector behind $y_{\tau+1}, \ldots, y_n$, say $\theta_R$, is different. For various applications
it may be necessary to extend this framework to models with dependence, as for
time series, and several of our methods work also for such cases. The statistical
challenge is to estimate τ, along with measures of uncertainty. The traditional ways
of reporting precision of parameter estimates are via standard errors (estimates
of standard deviation) or say 95% confidence intervals. Our preferred format is
that of a full confidence curve, say $\mathrm{cc}(\tau, y_{\mathrm{obs}})$, based on the observed dataset $y_{\mathrm{obs}}$.
Its interpretation is that, at the true change-point parameter $\tau$, the random set
$R(\alpha) = \{\tau : \mathrm{cc}(\tau, Y) \le \alpha\}$ ought to cover the true $\tau$ with probability
approximately equal to $\alpha$, with $Y$
denoting a random dataset drawn from the model; see Schweder & Hjort (2016)
for a full account of confidence distributions. In particular, confidence sets at any
confidence level can be read off from the confidence curve.
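As a minimal illustration of the framework above, the following sketch assumes the simplest special case, a normal model with unit variance and a shift in mean at $\tau$. The model choice, the function names `profile_loglik` and `confidence_set`, and all simulated values are illustrative assumptions, not the methods of this paper. The sketch computes the profile log-likelihood over candidate break points, takes its maximiser as a point estimate, and shows how a set $R(\alpha)$ would be read off from any confidence curve supplied as an array; it does not construct a confidence curve itself.

```python
import numpy as np

def profile_loglik(y):
    """Profile log-likelihood for tau in a unit-variance normal mean-shift model.

    For each candidate tau in 1..n-1 the left mean is estimated from y[:tau]
    and the right mean from y[tau:]. Returns (taus, values).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    taus = np.arange(1, n)
    values = np.empty(n - 1)
    for i, tau in enumerate(taus):
        left, right = y[:tau], y[tau:]
        rss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        values[i] = -0.5 * rss  # additive constants not depending on tau are dropped
    return taus, values

def confidence_set(taus, cc, alpha):
    """Read off R(alpha) = {tau : cc(tau) <= alpha} from a confidence curve."""
    taus = np.asarray(taus)
    cc = np.asarray(cc)
    return taus[cc <= alpha]

# Simulated data: assumed mean 0 before the break and mean 2 after it.
rng = np.random.default_rng(1)
n, tau_true = 100, 60
y = np.concatenate([rng.normal(0.0, 1.0, tau_true),
                    rng.normal(2.0, 1.0, n - tau_true)])

taus, ll = profile_loglik(y)
tau_hat = int(taus[np.argmax(ll)])
print("estimated break point:", tau_hat)

# Hypothetical confidence curve values, for illustration only.
print(confidence_set([58, 59, 60, 61, 62], [0.9, 0.4, 0.0, 0.5, 0.95], alpha=0.5))
```

Any construction that returns a curve on the grid of candidate break points can be passed to `confidence_set`, so confidence sets at several levels are obtained by varying `alpha` alone.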
The theory and applications of confidence distributions work out more easily for
continuous parameters in smooth models, for several reasons. First, for a continuous
parameter there is a possibility of having exact or nearly exact confidence distributions,
in the sense that $R(\alpha)$ given above covers the true parameter with probability equal to or
very close to $\alpha$, for each confidence level $\alpha$. This is not fully attainable for the present case
of change-point parameters, as the natural statistics informative for $\tau$, like a point
estimator $\hat\tau$, have discrete distributions. Secondly, various methods and results pertaining
to continuous parameters of smooth models, related to exact or approximate
distributions for such statistics, like large-sample normality or chi-squaredness of
deviances, are not valid and have no clear parallels when it comes to inference for $\tau$.
Confidence distributions and confidence curves may nevertheless be fruitfully constructed
for various situations with discrete parameters, as developed in Schweder
& Hjort (2016, Ch. 3). This is also the line of development and investigation for the
present paper.
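The discreteness point can be checked by simulation. The sketch below assumes a unit-variance normal mean-shift model and a least-squares estimator of the break point; both are illustrative assumptions, and the names `estimate_tau` and `simulate_tau_hat` are hypothetical. It tabulates the sampling distribution of the estimator over repeated datasets. All of its mass sits on integers, so the probability of a set of candidate break points moves in jumps as the set grows and generally cannot match a prescribed level exactly.

```python
import numpy as np

def estimate_tau(y):
    """Least-squares break point estimate for a single shift in mean."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    csum = np.cumsum(y)
    taus = np.arange(1, n)
    left_sum = csum[:-1]
    right_sum = csum[-1] - left_sum
    # Maximising this quantity is equivalent to minimising the
    # two-segment residual sum of squares.
    gain = left_sum ** 2 / taus + right_sum ** 2 / (n - taus)
    return int(taus[np.argmax(gain)])

def simulate_tau_hat(n, tau, shift, reps, seed=0):
    """Draw `reps` datasets with a mean shift after index tau; return the estimates."""
    rng = np.random.default_rng(seed)
    mean = np.where(np.arange(1, n + 1) <= tau, 0.0, shift)
    return np.array([estimate_tau(mean + rng.normal(size=n)) for _ in range(reps)])

tau_true = 30
tau_hats = simulate_tau_hat(n=50, tau=tau_true, shift=1.5, reps=2000)
values, counts = np.unique(tau_hats, return_counts=True)
probs = counts / counts.sum()

# Estimated probability that tau_hat falls within k of the true break point:
# a step function in k, since tau_hat takes integer values only.
for k in range(4):
    print(k, probs[np.abs(values - tau_true) <= k].sum())
```

The simulated probabilities depend on the assumed shift size and sample size; the qualitative feature of interest is only that they change in discrete steps.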
In Sections 2 and 3 we propose two different general methods for obtaining such
confidence curves for change-points. The first of these requires having a homogeneity
test for each given segment of data points where the hypothesis of no change can be
accurately examined. For this reason we develop classes of general goodness-of-fit
tests for such homogeneity hypotheses in Section 4. Tests we develop there, based on